Listing of Claims: 



This listing of claims will replace all prior versions and listings of claims in the application. 

1. (Currently Amended) A computer-implemented method of processing outputs of an
automatic system for probabilistic detection of events, comprising: 

collecting statistics at a training system executing on a computing device, the statistics 
related to observed outputs of the automatic system comprising: 

providing at least one input sequence to the automatic system, the input 
sequence associated with a transcript; 

observing an output sequence of a text generating system executing on a 
computing device of the automatic system generated in response to the provided at least 
one input sequence; and 

comparing the output sequence with the transcript; and

using the statistics at the training system to process automatically an original output
sequence of the automatic system and produce an alternate output sequence and a confidence
assessment regarding parts of at least one of the original output sequence and the alternate
output sequence, the process including [[by]] at least one of automatically supplementing and
replacing at least part of the original output sequence with the alternate output sequence in
accordance with the confidence assessment.
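
As a non-limiting illustration of the collecting and using steps recited in claim 1, the following minimal sketch counts how often each recognized word matched the transcript, then uses those counts to supplement or replace words in a new output. The function names, the pre-aligned word pairs, and the 0.5 threshold are assumptions made for this example only and are not taken from the application.

```python
from collections import Counter, defaultdict


def collect_statistics(aligned_pairs):
    """Count, for each recognized word, which transcript word it stood for.

    aligned_pairs: iterable of (recognized_word, transcript_word) tuples.
    Aligning the output sequence with the transcript is assumed done already.
    """
    stats = defaultdict(Counter)
    for recognized, truth in aligned_pairs:
        stats[recognized][truth] += 1
    return stats


def process_output(original, stats, threshold=0.5):
    """Return the alternate output sequence as a list of (word, confidence).

    A word is replaced only when a different word was the most frequent
    transcript match in training and its relative frequency reaches the
    threshold; otherwise the original word is kept and supplemented with
    its own estimated confidence. Unseen words get a confidence of None.
    """
    alternate = []
    for word in original:
        counts = stats.get(word)
        if not counts:
            alternate.append((word, None))  # no statistics for this word
            continue
        total = sum(counts.values())
        best, best_count = counts.most_common(1)[0]
        best_conf = best_count / total
        if best != word and best_conf >= threshold:
            alternate.append((best, best_conf))  # replace
        else:
            alternate.append((word, counts[word] / total))  # supplement
    return alternate
```

In this sketch, a word seen in training mostly as a misrecognition of another word is replaced by that word, while a reliably recognized word is kept and only annotated with a confidence value.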

2. (Previously Presented) A computer-implemented method as recited in claim 1, 
wherein at least part of the alternate output sequence contains information that can be used by 
systems that can use the at least part of the original output sequence directly. 

3. (Previously Presented) A computer-implemented method as recited in claim 2, 
wherein data in the alternate output sequence includes confidence assessments regarding parts 
of at least one of the original and alternate output sequences, where the confidence 
assessments supplement data in the original output sequence. 

4. (Canceled) 

5. (Previously Presented) A computer-implemented method as recited in claim 1,
wherein the alternate output sequence includes information of a plurality of alternatives that can
replace at least part of the original output sequence that can be used by systems that can use
the at least part of the original output sequence directly.

6. (Previously Presented) A computer-implemented method as recited in claim 5, 
wherein data in the alternate output sequence includes confidence assessments regarding parts 
of the alternatives, where the confidence assessments supplement data in the original output 
sequence. 

7. (Previously Presented) A computer-implemented method as recited in claim 5, 
wherein data in the alternate output sequence includes confidence assessments regarding parts 
of the alternatives, where the confidence assessments replace at least part of the original output 
sequence. 

8. (Canceled) 

9. (Canceled) 

10. (Previously Presented) A computer-implemented method as recited in claim 1, 
wherein the detected events involve word recognition. 

11. (Previously Presented) A computer-implemented method as recited in claim 10, 
wherein the automatic system is an automatic speech recognition system. 

12. (Previously Presented) A computer-implemented method as recited in claim 11, 
wherein the automatic speech recognition system operates on low-grade audio signals
having word recognition precision below 50 percent; and

wherein said method further comprises utilizing human transcription of the low-grade 
audio signals as a source for data relating to the statistics being collected. 

13. (Previously Presented) A computer-implemented method as recited in claim 10, 
wherein the automatic probabilistic event detection system is an automatic character recognition 
system. 



14. (Previously Presented) A computer-implemented method as recited in claim 10, 
wherein the alternate output sequence includes at least one of

an alternate recognition score for at least one of the words,

at least one alternate word that may have been one detectable event that transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable event
that transpired,

the at least one alternate sequence of words along with a recognition score for at least
one word that is part of the at least one alternate sequence of words,

an indication that no detectable event has transpired,

a word lattice describing a plurality of alternatives for detectable word sequences, and

the word lattice along with a recognition score for at least one among

    at least one word in the detectable word sequences,

    at least one path in the word lattice, and

    at least one edge in the word lattice.
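
As a non-limiting illustration of a word lattice carrying recognition scores on its edges and paths, the sketch below stores the lattice as a mapping from each node to its outgoing edges and enumerates every path with a score. The node numbering, the example words and scores, and the choice to score a path as the product of its edge scores are all assumptions made for this example.

```python
def lattice_paths(lattice, start, end):
    """Yield (words, score) for every path from start to end.

    lattice maps node -> list of (next_node, word, edge_score).
    The path score is taken as the product of the edge scores, which is
    an assumption of this sketch. The lattice must be acyclic.
    """
    stack = [(start, [], 1.0)]
    while stack:
        node, words, score = stack.pop()
        if node == end:
            yield words, score
            continue
        for next_node, word, edge_score in lattice.get(node, []):
            stack.append((next_node, words + [word], score * edge_score))


# Hypothetical two-step lattice: two alternatives at each step.
lattice = {
    0: [(1, "recognize", 0.6), (1, "wreck a nice", 0.4)],
    1: [(2, "speech", 0.7), (2, "beach", 0.3)],
}

# Pick the alternative word sequence with the highest path score.
best_words, best_score = max(lattice_paths(lattice, 0, 2), key=lambda p: p[1])
```

Each edge here plays the role of one alternate word with its recognition score, and each complete path is one alternate sequence of words.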

15. (Previously Presented) A computer-implemented method as recited in claim 1,
wherein said using comprises: 

building a first model modeling behavior of the automatic system as a process with at 
least one inner state, which may be unrelated to inner states of the automatic system, and 
inferring the at least one inner state of the process from the observed outputs of the automatic 
system; 

building a second model, based on the statistics obtained by said collecting, to infer data 
to at least one of supplement and replace at least part of the original output sequence from the 
at least one inner state of the process in the first model; 

combining the first and second models to form a function for converting the original 
output sequence into the alternate output sequence; and 

using the function on the original output sequence of the automatic system to create the 
alternate output sequence. 
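
As a non-limiting illustration of the two-model arrangement recited in claim 15, the sketch below takes the inner state to be the word identity together with a discretized recognition score, estimates a per-state probability of correctness by relative frequency, and combines both into one function applied to an original output sequence. The state definition, the four score bins, the assumption that scores lie in [0, 1], the default value for unseen states, and all function names are choices made for this example only.

```python
from collections import defaultdict


def infer_state(word, score, n_bins=4):
    """First model (assumed form): inner state = (word, score bucket).

    The score is assumed to lie in [0, 1] and is discretized into n_bins.
    """
    bucket = min(int(score * n_bins), n_bins - 1)
    return (word, bucket)


def fit_second_model(training):
    """Second model: relative-frequency estimate of P(correct | state).

    training: iterable of (word, score, was_correct) tuples obtained by
    comparing output sequences with their transcripts.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for word, score, was_correct in training:
        state = infer_state(word, score)
        totals[state] += 1
        hits[state] += int(was_correct)
    return {state: hits[state] / totals[state] for state in totals}


def make_function(second_model, default=0.5):
    """Combine both models into a function over original output sequences."""
    def convert(original):
        # original: list of (word, score); returns list of (word, confidence)
        return [
            (word, second_model.get(infer_state(word, score), default))
            for word, score in original
        ]
    return convert
```

Once built, the returned function can be applied repeatedly to different original output sequences without revisiting the training statistics.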

16. (Previously Presented) A computer-implemented method as recited in claim 15, 
further comprising repeating said using of the function on different original output sequences of 
the automatic system to create additional alternate output sequences. 



17. (Previously Presented) A computer-implemented method as recited in claim 15, 
wherein the process in said first model is one of a Generalized Hidden Markov process and a 
special case of a Generalized Hidden Markov process. 

18. (Previously Presented) A computer-implemented method as recited in claim 15, 
wherein the second model is a parametric model, and 

wherein said building of the second model uses at least one direct parametric estimation 
technique for inferring from at least one of the inner states. 

19. (Previously Presented) A computer-implemented method as recited in claim 18, 
wherein the at least one direct parametric estimation technique includes at least one of maximal 
likelihood estimation and entropy maximization. 

20. (Previously Presented) A computer-implemented method as recited in claim 15, 
wherein for at least one of the inner states said building of the second model uses at least one 
estimation technique utilizing information estimated for other inner states. 

21. (Previously Presented) A computer-implemented method as recited in claim 20,
wherein the at least one estimation technique utilizes at least one of a mixture model and kernel- 
based learning. 

22. (Previously Presented) A computer-implemented method as recited in claim 15, 
wherein said building of the first and second models assumes the inner states of the process to 
be fully determined by the observed outputs during at least one point in time. 

23. (Previously Presented) A computer-implemented method as recited in claim 22, 
wherein said building of the first and second models assumes the inner states of the process 
during at least one point in time to be fully determined by a subset of the observed outputs that 
includes at least an identity of at least one event detected by the automatic system. 

24. (Previously Presented) A computer-implemented method as recited in claim 15, 
wherein said building of at least one of the first and second models uses at least one 
discretization function. 



25. (Previously Presented) A computer-implemented method as recited in claim 15, 
wherein at least one of said building and combining uses Bayesian methods. 

26. (Previously Presented) A computer-implemented method as recited in claim 1,
further comprising repeating said collecting on several statistically different training materials. 

27. (Previously Presented) A computer-implemented method as recited in claim 26, 
wherein said collecting uses samples of statistically different sets of materials as initial training 
material. 

28. (Previously Presented) A computer-implemented method as recited in claim 26, 
further comprising identifying parameters that remain invariant between the statistically different 
sets of materials. 

29. (Previously Presented) A computer-implemented method as recited in claim 28, 
wherein said identifying improves estimation of at least one of the parameters. 

30. (Previously Presented) A computer-implemented method as recited in claim 28, 
wherein said identifying is used to enable training when available statistically self-similar sets of 
materials are too small to allow effective training. 

31. (Previously Presented) A computer-implemented method as recited in claim 28,
wherein said identifying is used to increase effectiveness of further training on material that is 
not statistically similar to initial training material. 

32. (Previously Presented) A computer-implemented method as recited in claim 1,
wherein material used for said collecting is statistically similar to material used during said using. 

33. (Currently Amended) At least one computer readable medium storing instructions for 
controlling at least one computer system to perform a method of processing outputs of an 
automatic system for probabilistic detection of events, comprising: 

collecting statistics related to observed outputs of the automatic system comprising:

providing at least one input sequence to the automatic system, the input
sequence associated with a transcript; 

observing an output sequence of the automatic system generated in response to 
the provided at least one input sequence; and 

comparing the output sequence with the transcript; and

using the statistics to process automatically an original output sequence of the automatic
system and produce an alternate output sequence and a confidence assessment regarding
parts of at least one of the original output sequence and the alternate output sequence, the
process including [[by]] at least one of automatically supplementing and replacing at least part of
the original output sequence with the alternate output sequence in accordance with the
confidence assessment.

34. (Original) At least one computer readable medium as recited in claim 33, wherein at 
least part of the alternate output sequence contains information that can be used by systems 
that can use the at least part of the original output sequence directly.

35. (Original) At least one computer readable medium as recited in claim 34, wherein 
data in the alternate output sequence includes confidence assessments regarding parts of at 
least one of the original and alternate output sequences, where the confidence assessments
supplement data in the original output sequence.

36. (Canceled)

37. (Original) At least one computer readable medium as recited in claim 33, wherein 
the alternate output sequence includes information of a plurality of alternatives that can replace 
at least part of the original output sequence that can be used by systems that can use the at
least part of the original output sequence directly.

38. (Original) At least one computer readable medium as recited in claim 37, wherein 
data in the alternate output sequence includes confidence assessments regarding parts of the 
alternatives, where the confidence assessments supplement data in the original output
sequence. 



39. (Original) At least one computer readable medium as recited in claim 37, wherein 
data in the alternate output sequence includes confidence assessments regarding parts of the 
alternatives, where the confidence assessments replace at least part of the original output
sequence. 

40. (Canceled) 

41. (Canceled) 

42. (Original) At least one computer readable medium as recited in claim 33, wherein 
the detected events involve word recognition. 

43. (Original) At least one computer readable medium as recited in claim 42, wherein 
the automatic system is an automatic speech recognition system. 

44. (Original) At least one computer readable medium as recited in claim 43, 
wherein the automatic speech recognition system operates on low-grade audio signals
having word recognition precision below 50 percent; and

wherein said method further comprises utilizing human transcription of the low-grade 
audio signals as a source for data relating to the statistics being collected. 

45. (Original) At least one computer readable medium as recited in claim 42, wherein 
the automatic probabilistic event detection system is an automatic character recognition system. 

46. (Original) At least one computer readable medium as recited in claim 42, wherein 
the alternate output sequence includes at least one of

an alternate recognition score for at least one of the words,

at least one alternate word that may have been one detectable event that transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable event
that transpired,

the at least one alternate sequence of words along with a recognition score for at least
one word that is part of the at least one alternate sequence of words,

an indication that no detectable event has transpired,

a word lattice describing a plurality of alternatives for detectable word sequences, and

the word lattice along with a recognition score for at least one among

    at least one word in the detectable word sequences,

    at least one path in the word lattice, and

    at least one edge in the word lattice.

47. (Original) At least one computer readable medium as recited in claim 33, wherein 
said using comprises: 

building a first model modeling behavior of the automatic system as a process with at 
least one inner state, which may be unrelated to inner states of the automatic system, and 
inferring the at least one inner state of the process from the observed outputs of the automatic 
system; 

building a second model, based on the statistics obtained by said collecting, to infer data 
to at least one of supplement and replace at least part of the original output sequence from the
at least one inner state of the process in the first model;

combining the first and second models to form a function for converting the original
output sequence into the alternate output sequence; and

using the function on the original output sequence of the automatic system to create the
alternate output sequence. 

48. (Original) At least one computer readable medium as recited in claim 47, further 
comprising repeating said using of the function on different original output sequences of the
automatic system to create additional alternate output sequences. 

49. (Original) At least one computer readable medium as recited in claim 47, wherein 
the process in said first model is one of a Generalized Hidden Markov process and a special 
case of a Generalized Hidden Markov process. 

50. (Original) At least one computer readable medium as recited in claim 47, 
wherein the second model is a parametric model, and 

wherein said building of the second model uses at least one direct parametric estimation 
technique for inferring from at least one of the inner states. 



51. (Original) At least one computer readable medium as recited in claim 50, wherein
the at least one direct parametric estimation technique includes at least one of maximal 
likelihood estimation and entropy maximization. 

52. (Original) At least one computer readable medium as recited in claim 47, wherein for 
at least one of the inner states said building of the second model uses at least one estimation 
technique utilizing information estimated for other inner states. 

53. (Original) At least one computer readable medium as recited in claim 52, wherein 
the at least one estimation technique utilizes at least one of a mixture model and kernel-based 
learning. 

54. (Original) At least one computer readable medium as recited in claim 47, wherein 
said building of the first and second models assumes the inner states of the process to be fully 
determined by the observed outputs during at least one point in time. 

55. (Original) At least one computer readable medium as recited in claim 54, wherein 
said building of the first and second models assumes the inner states of the process during at 
least one point in time to be fully determined by a subset of the observed outputs that includes at 
least an identity of at least one event detected by the automatic system. 

56. (Original) At least one computer readable medium as recited in claim 47, wherein 
said building of at least one of the first and second models uses at least one discretization 
function. 

57. (Original) At least one computer readable medium as recited in claim 47, wherein at 
least one of said building and combining uses Bayesian methods. 

58. (Original) At least one computer readable medium as recited in claim 33, further 
comprising repeating said collecting on several statistically different training materials. 

59. (Original) At least one computer readable medium as recited in claim 58, wherein 
said collecting uses samples of statistically different sets of materials as initial training material. 



60. (Original) At least one computer readable medium as recited in claim 59, further 
comprising identifying parameters that remain invariant between the statistically different sets of 
materials. 

61. (Original) At least one computer readable medium as recited in claim 60, wherein
said identifying improves estimation of at least one of the parameters. 

62. (Original) At least one computer readable medium as recited in claim 60, wherein 
said identifying is used to enable training when available statistically self-similar sets of materials 
are too small to allow effective training. 

63. (Original) At least one computer readable medium as recited in claim 60, wherein 
said identifying is used to increase effectiveness of further training on material that is not 
statistically similar to initial training material. 

64. (Original) At least one computer readable medium as recited in claim 33, wherein 
material used for said collecting is statistically similar to material used during said using. 

65. (Currently Amended) An apparatus for processing outputs of an automatic system 
for probabilistic detection of events, comprising: 

collection means for collecting statistics related to observed outputs of the automatic 
system comprising: 

providing means for providing at least one input sequence to the automatic 
system, the input sequence associated with a transcript; 

observing means for observing an output sequence of the automatic system 
generated in response to the provided at least one input sequence; and 

comparing means for comparing the output sequence with the transcript; and

processing means for using the statistics to process automatically an original output
sequence of the automatic system and produce an alternate output sequence and a confidence
assessment regarding parts of at least one of the original output sequence and the alternate
output sequence, the process including [[by]] at least one of automatically supplementing and
replacing at least part of the original output sequence with the alternate output sequence in
accordance with the confidence assessment.



66. (Original) An apparatus as recited in claim 65, wherein the alternate output 
sequence includes information of a plurality of alternatives that can replace at least part of the 
original output sequence that can be used by systems that can use the at least part of the
original output sequence directly.

67. (Original) An apparatus as recited in claim 66, wherein data in the alternate output 
sequence includes confidence assessments regarding parts of the alternatives, where the 
confidence assessments supplement data in the original output sequence.

68. (Canceled)

69. (Original) An apparatus as recited in claim 65, wherein the detected events involve 
word recognition. 

70. (Original) An apparatus as recited in claim 69, wherein the automatic system is an 
automatic speech recognition system. 

71. (Canceled) 

72. (Original) An apparatus as recited in claim 69, wherein the alternate output 
sequence includes at least one of

an alternate recognition score for at least one of the words,

at least one alternate word that may have been one detectable event that transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable event
that transpired,

the at least one alternate sequence of words along with a recognition score for at least
one word that is part of the at least one alternate sequence of words,

an indication that no detectable event has transpired,

a word lattice describing a plurality of alternatives for detectable word sequences, and

the word lattice along with a recognition score for at least one among

    at least one word in the detectable word sequences,

    at least one path in the word lattice, and

    at least one edge in the word lattice.

73. (Original) An apparatus as recited in claim 65, wherein said processing means 
comprises: 

first model means for building a first model modeling behavior of the automatic system as 
a process with at least one inner state, which may be unrelated to inner states of the automatic 
system, and inferring the at least one inner state of the process from the observed outputs of the 
automatic system; 

second model means for building a second model, based on the statistics obtained by 
said collection means, to infer data to at least one of supplement and replace at least part of the 
original output sequence from the at least one inner state of the process in the first model;

combination means for combining the first and second models to form a function for
converting the original output sequence into the alternate output sequence; and

function means for applying the function to the original output sequence of the automatic
system to create the alternate output sequence. 

74. (Original) An apparatus as recited in claim 73, wherein said function means applies 
the function on different original output sequences of the automatic system to create additional
alternate output sequences. 

75. (Original) An apparatus as recited in claim 73, wherein the process in said first 
model is one of a Generalized Hidden Markov process and a special case of a Generalized 
Hidden Markov process. 

76. (Original) An apparatus as recited in claim 73, 
wherein the second model is a parametric model, and 

wherein said second model means uses at least one direct parametric estimation 
technique for inferring from at least one of the inner states. 

77. (Original) An apparatus as recited in claim 73, wherein said second model means, 
for at least one of the inner states, uses at least one estimation technique utilizing information 
estimated for other inner states. 



78. (Original) An apparatus as recited in claim 77, wherein the at least one estimation 
technique utilizes at least one of a mixture model and kernel-based learning. 

79. (Currently Amended) A system for processing outputs of an automatic system for 
probabilistic detection of events, comprising: 

an interface to receive observed outputs from the automatic system; and 
at least one processor programmed to: 

collect statistics related to the observed outputs of the automatic system by: 

providing at least one input sequence to the automatic system, the input 
sequence associated with a transcript; 

observing an output sequence of the automatic system generated in response to 
the provided at least one input sequence; and 

comparing the output sequence with the transcript; and

use the statistics to automatically produce an alternate output sequence and a
confidence assessment regarding parts of at least one of the original output sequence and the
alternate output sequence, thereafter [[by]] at least one of automatically supplementing and
replacing at least part of the original output sequence of the automatic system with the alternate
output sequence in accordance with the confidence assessment.

80. (Original) A system as recited in claim 79, wherein at least part of the alternate 
output sequence includes information of a plurality of alternatives that can replace at least part of 
the original output sequence that can be used by systems that can use the at least part of the
original output sequence directly.

81. (Original) A system as recited in claim 79, wherein data in the alternate output
sequence includes confidence assessments regarding parts of the alternatives, where the
confidence assessments supplement data in the original output sequence.

82. (Canceled)

83. (Original) A system as recited in claim 79, wherein the detected events involve word 
recognition. 



84. (Original) A system as recited in claim 83, wherein the automatic system is an 
automatic speech recognition system. 

85. (Original) A system as recited in claim 84, 

wherein the automatic speech recognition system operates on low-grade audio signals 
having word recognition precision below 50 percent; and 

wherein said interface further receives human transcription of the low-grade audio 
signals for use by said processor as data relating to the statistics being collected. 

86. (Original) A system as recited in claim 83, wherein the alternate output sequence 
includes at least one of 

an alternate recognition score for at least one of the words, 
at least one alternate word that may have been one detectable event that transpired, 
the at least one alternate word along with a recognition score for the at least one 
alternate word, 

at least one alternate sequence of words that may have been another detectable event 
that transpired, 

the at least one alternate sequence of words along with a recognition score for at least 
one word that is part of the at least one alternate sequence of words, 
an indication that no detectable event has transpired, 

a word lattice describing a plurality of alternatives for detectable word sequences, and 
the word lattice along with a recognition score for at least one among 
at least one word in the detectable word sequences, 
at least one path in the word lattice, and 
at least one edge in the word lattice. 

87. (Original) A system as recited in claim 79, wherein said processor is programmed to 
build a first model modeling behavior of the automatic system as a process with at least one 
inner state, which may be unrelated to inner states of the automatic system, and inferring the at 
least one inner state of the process from the observed outputs of the automatic system, to build 
a second model, based on the statistics obtained, to infer data to at least one of supplement and 
replace at least part of the original output sequence from the at least one inner state of the
process in the first model, to combine the first and second models to form a function for
converting the original output sequence into the alternate output sequence, and to apply the
function to the original output sequence of the automatic system to create the alternate output
sequence.

88. (Original) A system as recited in claim 87, wherein said processor applies the 
function on different original output sequences of the automatic system to create additional
alternate output sequences. 

89. (Original) A system as recited in claim 87, wherein the process in said first model is 
one of a Generalized Hidden Markov process and a special case of a Generalized Hidden 
Markov process. 

90. (Original) A system as recited in claim 87, 
wherein the second model is a parametric model, and 

wherein said processor builds the second model using at least one direct parametric 
estimation technique for inferring from at least one of the inner states. 

91. (Original) A system as recited in claim 87, wherein said processor, for at least one of
the inner states, uses at least one estimation technique utilizing information estimated for other 
inner states. 

92. (Original) A system as recited in claim 91, wherein said processor is programmed to
utilize at least one of a mixture model and kernel-based learning as the at least one estimation 
technique. 



